The dataset airmiles is a time series of the miles flown annually by commercial airlines in the US from 1937 to 1960.
Before plotting the graph, think about what shape you would expect it to have. Plot the series and comment on the differences between what you get and your expectations.
Which aspect ratio conveys the information you find in the series best?
Do you think the graph looks better as a line graph (as suggested on the R help page for the dataset) or with points as well?
Might plotting a transformation help you to look more closely at the early years or would zooming in be sufficient?
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 412 1580 6431 10528 17532 30514
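The display choices raised above (line only or line with points, raw or transformed scale) can be compared side by side. This is a minimal sketch in base R graphics; the two-panel layout, axis labels, and the choice of a log10 transformation are illustrative assumptions, not part of the exercise.

```r
# airmiles is a ts object in base R's datasets package (1937 to 1960)
data(airmiles)

op <- par(mfrow = c(1, 2))  # two panels side by side

# Line graph with points overplotted, so each year is visible
plot(airmiles, type = "o", pch = 16,
     xlab = "Year", ylab = "airmiles", main = "Original scale")

# A log10 transformation spreads out the small early values
plot(log10(airmiles), type = "o", pch = 16,
     xlab = "Year", ylab = "log10(airmiles)", main = "Log scale")

par(op)  # restore the previous graphics settings
```

Resizing the graphics window changes the aspect ratio, which is a quick way to see which shape shows the growth pattern most clearly.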
The Beveridge index of wheat prices covers almost four hundred years of European history from 1500 to 1869 and is available in the dataset bev in tseries.
Plot the series and explain why you have decided to plot it in that way.
Are there any particular features in the series which stand out? How would you summarise the information in the series in words?
Many important historical events took place over this time period, including the Thirty Years’ War, the English Civil War, and the Napoleonic Wars. Is there any evidence of any of these having an effect on the index?
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 11.0 64.0 98.0 107.9 143.8 381.0
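To look for effects of the wars mentioned above, the war periods can be shaded behind the series. The sketch below assumes the tseries package is installed; the `wars` data frame and its date ranges (the commonly quoted 1618 to 1648, 1642 to 1651, and 1803 to 1815) are added here for illustration and do not come from the dataset.

```r
# War periods added for illustration (commonly quoted date ranges)
wars <- data.frame(
  name  = c("Thirty Years' War", "English Civil War", "Napoleonic Wars"),
  start = c(1618, 1642, 1803),
  end   = c(1648, 1651, 1815)
)

if (requireNamespace("tseries", quietly = TRUE)) {
  data(bev, package = "tseries")

  # Set up the axes first, shade the war periods, then draw the series on top
  plot(bev, type = "n", xlab = "Year", ylab = "Beveridge wheat price index")
  usr <- par("usr")  # plot region limits: x1, x2, y1, y2
  rect(wars$start, usr[3], wars$end, usr[4], col = "grey85", border = NA)
  lines(bev)
  box()
}
```

A wide, short graphics window tends to suit a series of this length, since it makes the many local peaks easier to separate.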
## answer
# 3. Goals in soccer games
The Bundesliga dataset was used in §11.2.
Plot graphs of the rates of home and away goals per game over the seasons in the same plot. What limits do you recommend for the vertical scale?
Other possibilities for studying the home and away goal rates per game include plotting the differences or ratios over time and drawing a scatterplot of one rate against another. Is there any information in these graphics that is shown better by one than the others?
Can you find equivalent data for the top soccer league in your own country and are there similar patterns over the years?
## 'data.frame': 14018 obs. of 7 variables:
## $ HomeTeam : Factor w/ 52 levels "1. FC Kaiserslautern",..: 50 27 33 18 28 4 43 37 12 3 ...
## $ AwayTeam : Factor w/ 52 levels "1. FC Kaiserslautern",..: 12 3 24 1 31 2 17 45 43 50 ...
## $ HomeGoals: int 3 1 1 1 1 0 1 2 3 3 ...
## $ AwayGoals: int 2 1 1 1 4 2 1 0 3 0 ...
## $ Round : int 1 1 1 1 1 1 1 1 2 2 ...
## $ Year : int 1963 1963 1963 1963 1963 1963 1963 1963 1963 1963 ...
## $ Date : POSIXt, format: "1963-08-24 17:30:00" "1963-08-24 17:30:00" ...
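The home and away goal rates per season can be computed from the columns listed above and drawn in one plot. This is a sketch under stated assumptions: `goal_rates` is a hypothetical helper introduced here, the vcd package is assumed to be installed, and starting the vertical axis at zero is one possible answer to the question about limits, not the only one.

```r
# Mean home and away goals per game for each season
goal_rates <- function(games) {
  aggregate(cbind(HomeGoals, AwayGoals) ~ Year, data = games, FUN = mean)
}

if (requireNamespace("vcd", quietly = TRUE)) {
  data(Bundesliga, package = "vcd")
  rates <- goal_rates(Bundesliga)
  both  <- as.matrix(rates[, c("HomeGoals", "AwayGoals")])

  # A vertical axis starting at zero shows the two rates in proportion
  matplot(rates$Year, both, type = "l", lty = 1, col = c("black", "red"),
          ylim = c(0, max(both)),
          xlab = "Season", ylab = "Goals per game")
  legend("topright", legend = c("Home", "Away"),
         lty = 1, col = c("black", "red"), bty = "n")
}
```

The same `rates` data frame can be reused for the alternatives in the exercise, for example by plotting `rates$HomeGoals - rates$AwayGoals` or `rates$HomeGoals / rates$AwayGoals` against `rates$Year`.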
Important early demographic analyses were carried out on English data from the seventeenth century. The Arbuthnot dataset in the HistData package includes data on the numbers of male and female christenings in London from 1629 to 1710.
Plot the number of male christenings over time. Which features stand out?
Why do you think there was a low level of christenings from around the mid-1640s to 1660?
Two low outliers stand out, in 1666, presumably because of the Great Fire of London and the plague the previous year, and in 1704. A possible explanation for the 1704 outlier is given on the R help page for the dataset. Compare the data values for 1674 and 1704 to check the explanation.
## 'data.frame': 82 obs. of 7 variables:
## $ Year : int 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 ...
## $ Males : int 5218 4858 4422 4994 5158 5035 5106 4917 4703 5359 ...
## $ Females : int 4683 4457 4102 4590 4839 4820 4928 4605 4457 4952 ...
## $ Plague : int 0 1317 274 8 0 1 0 10400 3082 363 ...
## $ Mortality: int 8771 10554 8562 9535 8393 10400 10651 23359 11763 13624 ...
## $ Ratio : num 1.11 1.09 1.08 1.09 1.07 ...
## $ Total : num 9.9 9.31 8.52 9.58 10 ...
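The plot of male christenings and the comparison of 1674 with 1704 can be sketched as follows. `rows_for_years` is a hypothetical helper introduced here, and the HistData package is assumed to be installed; the red markers for 1666 and 1704 are an illustrative choice.

```r
# Return the rows of a data frame for the requested years
rows_for_years <- function(df, years) df[df$Year %in% years, ]

if (requireNamespace("HistData", quietly = TRUE)) {
  data(Arbuthnot, package = "HistData")

  # Male christenings over time
  plot(Arbuthnot$Year, Arbuthnot$Males, type = "l",
       xlab = "Year", ylab = "Male christenings")

  # Mark the two low outliers discussed in the exercise
  low <- rows_for_years(Arbuthnot, c(1666, 1704))
  points(low$Year, low$Males, pch = 16, col = "red")

  # Print the 1674 and 1704 values side by side for comparison
  print(rows_for_years(Arbuthnot, c(1674, 1704))[, c("Year", "Males", "Females")])
}
```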
Consider the numbers of goals scored by each team.
How would you plot the annual average goals per home game for each team in the Bundesliga over the 46 seasons in the dataset? Would you choose a single graphic or a trellis display? Only one team has been a member of the Bundesliga ever since it started, Hamburg. How do you think the time series of teams with incomplete records should be displayed?
You could compare the annual home and away scoring rates of particular teams by plotting the two time series on the same display or by drawing a scatterplot of one variable against the other. Using the two teams Hamburg and Bayern Munich, comment on which display you think is better. Do the displays provide different kinds of information?
# 6. Deaths by horsekick
Plot separate displays for each of the 14 corps in the von Bortkiewicz dataset (VonBort in vcd).
Do any of the patterns stand out as different?
11 of the 14 corps had no deaths in the first year (1875). Could this be worth looking into?
Answer:
a: I think the last name of the data frame is better for showing this kind of data, and a heat map could also show this kind of data structure.
## `summarise()` has grouped output by 'Year'. You can override using the `.groups` argument.
## `summarise()` has grouped output by 'Year'. You can override using the `.groups` argument.
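The per-team averages asked for in this exercise could be computed and shown as a trellis display along these lines. This is a sketch, not the code that produced the messages above: `team_home_rates` is a hypothetical helper, base `aggregate` is used in place of `summarise()`, and the vcd and lattice packages are assumed to be installed.

```r
# Mean goals per home game for each team in each season
team_home_rates <- function(games) {
  aggregate(HomeGoals ~ Year + HomeTeam, data = games, FUN = mean)
}

if (requireNamespace("vcd", quietly = TRUE) &&
    requireNamespace("lattice", quietly = TRUE)) {
  data(Bundesliga, package = "vcd")
  rates <- team_home_rates(Bundesliga)

  # One panel per team on a common scale; type = "b" keeps single seasons visible
  print(lattice::xyplot(HomeGoals ~ Year | HomeTeam, data = rates,
                        type = "b", cex = 0.4, as.table = TRUE,
                        ylab = "Goals per home game"))
}
```

Seasons in which a team was not in the league are simply absent from `rates`, so the lines join across those gaps. Filling the missing seasons with `NA` before plotting would break the lines instead, which may be a fairer display for teams with incomplete records.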
Plot separate displays for each of the 14 corps in the von Bortkiewicz dataset (VonBort in vcd).
## Loading required package: grid
## Warning in plot.xy(xy, type, ...): plot type 'line' will be truncated to first
## character
Do any of the patterns stand out as different?
11 of the 14 corps had no deaths in the first year (1875). Could this be worth looking into?
Answer:
a: No. Corps XIV in 1875 has 1 death.
b: I think it is just a special case.
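The separate displays for the 14 corps, and the count of corps with no deaths in 1875, can be sketched in base R as below. The vcd package is assumed to be installed, and `zero_death_corps` is a hypothetical helper introduced here. Passing `type = "l"` rather than `type = "line"` avoids the "plot type 'line' will be truncated to first character" warning shown in the output above.

```r
# Count the corps with no recorded deaths in a given year
zero_death_corps <- function(d, yr) sum(d$deaths[d$year == yr] == 0)

if (requireNamespace("vcd", quietly = TRUE)) {
  data(VonBort, package = "vcd")

  op <- par(mfrow = c(4, 4), mar = c(2, 2, 2, 1))  # room for 14 small panels
  for (cp in unique(as.character(VonBort$corps))) {
    one <- VonBort[VonBort$corps == cp, ]
    # Common vertical scale so the corps can be compared directly
    plot(one$year, one$deaths, type = "l",
         ylim = range(VonBort$deaths), main = cp, xlab = "", ylab = "")
  }
  par(op)

  # Number of corps with zero deaths in the first year
  print(zero_death_corps(VonBort, 1875))
}
```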
The package ggplot2 includes a dataset of five US economic indicators recorded monthly over about 40 years, economics.
## spec_tbl_df [574 × 6] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
## $ date : Date[1:574], format: "1967-07-01" "1967-08-01" ...
## $ pce : num [1:574] 507 510 516 512 517 ...
## $ pop : num [1:574] 198712 198911 199113 199311 199498 ...
## $ psavert : num [1:574] 12.6 12.6 11.9 12.9 12.8 11.8 11.7 12.3 11.7 12.3 ...
## $ uempmed : num [1:574] 4.5 4.7 4.6 4.9 4.7 4.8 5.1 4.5 4.1 4.6 ...
## $ unemploy: num [1:574] 2944 2945 2958 3143 3066 ...
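The five indicators are measured on very different scales, so a first look is easier with one panel per series. A minimal sketch, assuming ggplot2 is installed for the data only; `as_monthly_ts` is a hypothetical helper, and the start date of July 1967 is taken from the first `date` value shown above.

```r
# Convert a data frame with a 'date' column into a monthly multivariate ts
as_monthly_ts <- function(df, start) {
  ts(as.data.frame(df[, names(df) != "date"]), start = start, frequency = 12)
}

if (requireNamespace("ggplot2", quietly = TRUE)) {
  econ <- as_monthly_ts(ggplot2::economics, start = c(1967, 7))

  # plot.ts draws each series in its own panel with its own vertical scale
  plot(econ, main = "US economic indicators")
}
```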
The dataset bomregions in the DAAG package includes seven regional time series of annual rain in Australia and one time series averaged over the country.
## 'data.frame': 109 obs. of 22 variables:
## $ Year : num 1900 1901 1902 1903 1904 ...
## $ eastAVt : num NA NA NA NA NA NA NA NA NA NA ...
## $ seAVt : num NA NA NA NA NA NA NA NA NA NA ...
## $ southAVt : num NA NA NA NA NA NA NA NA NA NA ...
## $ swAVt : num NA NA NA NA NA NA NA NA NA NA ...
## $ westAVt : num NA NA NA NA NA NA NA NA NA NA ...
## $ northAVt : num NA NA NA NA NA NA NA NA NA NA ...
## $ mdbAVt : num NA NA NA NA NA NA NA NA NA NA ...
## $ auAVt : num NA NA NA NA NA NA NA NA NA NA ...
## $ eastRain : num 430 500 315 694 565 ...
## $ seRain : num 603 511 421 628 551 ...
## $ southRain: num 375 314 284 421 388 ...
## $ swRain : num 738 559 542 729 711 ...
## $ westRain : num 400 323 363 377 418 ...
## $ northRain: num 360 476 345 601 604 ...
## $ mdbRain : num 413 365 256 525 448 ...
## $ auRain : num 369 402 317 519 505 ...
## $ SOI : num -5.55 0.992 0.458 4.933 4.35 ...
## $ co2mlo : num NA NA NA NA NA NA NA NA NA NA ...
## $ co2law : num 296 296 296 297 297 ...
## $ CO2 : num 296 297 297 297 298 ...
## $ sunspot : num 9.5 2.7 5 24.4 42 63.5 53.8 62 48.5 43.9 ...
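The eight rain series can be picked out by their column names and plotted together. This sketch assumes the installed DAAG version still ships a dataset named bomregions with the columns listed above; `rain_columns` is a hypothetical helper introduced here.

```r
# Names of the columns holding annual rain (they end in "Rain")
rain_columns <- function(df) grep("Rain$", names(df), value = TRUE)

if (requireNamespace("DAAG", quietly = TRUE)) {
  suppressWarnings(data(bomregions, package = "DAAG"))

  if (exists("bomregions")) {
    # Annual series starting in the first recorded year
    rain <- ts(bomregions[, rain_columns(bomregions)],
               start = min(bomregions$Year))

    # One panel per region, each with its own vertical scale
    plot(rain, main = "Annual rain in Australia by region")
  }
}
```

Calling `plot(rain, plot.type = "single")` instead puts all the series on one common scale, which makes levels comparable between regions at the cost of overplotting.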
##
## Attaching package: 'dplR'
## The following object is masked from 'package:zoo':
##
## time<-
## Classes 'rwl' and 'data.frame': 1358 obs. of 34 variables:
## $ CAM011: num NA NA NA NA NA NA NA NA NA NA ...
## $ CAM021: num NA NA NA NA NA NA NA NA NA NA ...
## $ CAM031: num NA NA NA NA NA NA NA NA NA NA ...
## $ CAM032: num NA NA NA NA NA NA NA NA NA NA ...
## $ CAM041: num NA NA NA NA NA NA NA NA NA NA ...
## $ CAM042: num NA NA NA NA NA NA NA NA NA NA ...
## $ CAM051: num NA NA NA NA NA NA NA NA NA NA ...
## $ CAM061: num NA NA NA NA NA NA NA NA NA NA ...
## $ CAM062: num NA NA NA NA NA NA NA NA NA NA ...
## $ CAM071: num NA NA NA NA NA NA NA NA NA NA ...
## $ CAM072: num NA NA NA NA NA NA NA NA NA NA ...
## $ CAM081: num NA NA NA NA NA NA NA NA NA NA ...
## $ CAM082: num NA NA NA NA NA NA NA NA NA NA ...
## $ CAM091: num NA NA NA NA NA NA NA NA NA NA ...
## $ CAM092: num NA NA NA NA NA NA NA NA NA NA ...
## $ CAM101: num NA NA NA NA NA NA NA NA NA NA ...
## $ CAM102: num NA NA NA NA NA NA NA NA NA NA ...
## $ CAM111: num NA NA NA NA NA NA NA NA NA NA ...
## $ CAM112: num NA NA NA NA NA NA NA NA NA NA ...
## $ CAM121: num NA NA NA NA NA NA NA NA NA NA ...
## $ CAM122: num NA NA NA NA NA NA NA NA NA NA ...
## $ CAM131: num NA NA NA NA NA NA NA NA NA NA ...
## $ CAM132: num NA NA NA NA NA NA NA NA NA NA ...
## $ CAM141: num NA NA NA NA NA NA NA NA NA NA ...
## $ CAM151: num NA NA NA NA NA NA NA NA NA NA ...
## $ CAM152: num NA NA NA NA NA NA NA NA NA NA ...
## $ CAM161: num NA NA NA NA NA NA NA NA NA NA ...
## $ CAM162: num NA NA NA NA NA NA NA NA NA NA ...
## $ CAM171: num NA NA NA NA NA NA NA NA NA NA ...
## $ CAM172: num NA NA NA NA NA NA NA NA NA NA ...
## $ CAM181: num NA NA NA NA NA NA NA NA NA NA ...
## $ CAM191: num NA NA NA NA NA NA NA NA NA NA ...
## $ CAM201: num NA NA NA NA NA NA NA NA NA NA ...
## $ CAM211: num 0.17 0.13 0.14 0.19 0.22 0.27 0.31 0.22 0.28 0.34 ...
Salvador Dalí’s painting The Persistence of Memory is in the New York Museum of Modern Art. Do you think the distorted clocks could be interpreted as alternative models of time series?
Answer: I think it is possible, in that multiple graphics can be combined to represent the painting in an integrated way.